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Abstract 

Most cancers in humans are large, measuring centimeters in diameter, eomposed of 
many billions of cells. An equivalent mass of normal cells would be highly heterogene¬ 
ous as a result of the mutations that oecur during each cell division. What is remarkable 
about eancers is their homogeneity - virtually every neoplastic cell within a large cancer 
contains the same core set of genetic alterations, with heterogeneity confined to muta¬ 
tions that have emerged after the last clonal expansions. How such clones expand within 
the spatially-constrained three dimensional architeeture of a tumor, and come to domi¬ 
nate a large, pre-existing lesion, has never been explained. We here describe a model for 
tumor evolution that shows how short-range migration and cell turnover can account for 
rapid cell mixing inside the tumor. With it, we show that even a small selective ad¬ 
vantage of a single cell within a large tumor allows the deseendants of that cell to re¬ 
place the precursor mass in a clinically relevant time frame. We also demonstrate that 
the same mechanisms can be responsible for the rapid onset of resistanee to chemother¬ 
apy. Our model not only provides novel insights into spatial and temporal aspeets of 
tumor growth but also suggests that targeting short range cellular migratory activity 
could have dramatie effects on tumor growth rates. 


1. Introduction 

Tumor growth is initiated when a single cell aequires genetie or epigenetie alterations that change 
the cell’s net growth rate (birth minus death) and enable its progeny to outgrow surrounding cells. 
As these small lesions grow, the cells aequire additional alterations that eause them to multiply even 
faster and to change their metabolism to better survive the harsh eonditions and nutrient deprivation. 
This progression eventually leads to a malignant tumor that can invade surrounding tissues and 
spread to other organs. Typical solid tumors contain about 30-70 clonal amino-acid-changing muta¬ 
tions that have aeeumulated during this multi-stage progression [1]. Despite the faet that a majority 
of these mutations are believed to be passengers that do not affect growth, and only -'5% to 10% are 
drivers that provide eells with a small selective growth advantage, a major fraction of the mutations. 
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particularly the drivers, are present in 30%-80% of eells in the primary tumor as well as in metastatic 
lesions derived from it [2,3,4,5]. 

Most attempts at explaining the genetie make-up of tumors assume well-mixed populations of eells 
and do not ineorporate spatial eonstraints [6,7,8,9,10,11]. Several models of the genetic evolution of 
expanding tumors have been developed in the past [12,13,14,15,16,17,18], but they either assume 
very few mutations [12,13,14,15,16] or one- or two-dimensional growth [15,16,17,18]. On the other 
hand, models that ineorporate spatial limitations have been developed to help understand proeesses 
sueh as tumor metabolism [19,20], angiogenesis [21,22,23], and eell migration [24], but these mod¬ 
els ignore geneties. Here, we formulate a model that eombines spatial growth and genetie evolution 
and use the model to deseribe the growth of primary tumors and metastases, as well as the develop¬ 
ment of resistanee to therapeutie agents. 


2. Metastasis 

We first model the expansion of a metastatic lesion derived from a eaneer eell that has escaped its 
primary site (e.g., breast or eoloreetal epithelium) and travelled through the eireulation until it 
lodged at a distant site (e.g., lung or liver). The eell initiating the metastatie lesion is assumed to 
have all the driver gene mutations needed to expand. Motivated by histopathologieal images (Fig. 
la) we model the lesion as a eonglomerate of “balls” of eells. Cells oeeupy sites in a regular three- 
dimensional lattiee (Fig. 6a-b). Cells replieate stochastically with rates proportional to the number of 
surrounding empty sites (non-neoplastie eells or extracellular matrix), hence replieation is faster at 
the edge of the tumor. This is supported by experimental data (Fig. lb,c and Table 1). A eell with no 
eaneer eell neighbors replieates at the maximal rate of h=ln(2)=0.69 days"', equivalent to 24h eell 
doubling time, and a eell that is eompletely surrounded by other eaneer eells does not replicate. Cells 
can also mutate, but we assume all mutations are passengers (they do not confer fitness advantages). 
Upon replieation, a eell moves with a small probability M to a nearby plaee close to the surfaee of 
the lesion and ereates a new lesion. This “sprouting” of initial lesions ean be due to short-range mi- 
gration following an epithelial-to-mesenehymal transition and eonsecutive reversion to the non- 
motile phenotype. Alternatively, it could be the result of another proeess sueh as angiogenesis (See. 
6.1.2) through which the tumor gains better aceess to nutrients. The same model governs the evolu¬ 
tion of larger metastatie lesions that have already developed extensive vaseulature. Cells die with 
rate d independent of the number of neighbors, and are replaeed by empty sites (non-neoplastie eells 
within the loeal tumor environment). 

If there is little dispersal (M « 0), the shape of the tumor beeomes roughly spherieal as it grows to 
large size (Fig. 2a, Fig. 6e and Supp. Video 2). However, even a very small amount of dispersal 
strikingly affeets the predieted shape. For M>0, the tumor forms a eonglomerate of "balls" (Fig. 2b, 
Fig. 6o, and Supp. Video 3), mueh like those observed in aetual metastatie lesions, with the balls 
separated by islands of non-neoplastie stromal eells mixed with extracellular matrix. In addition to 
this remarkable ehange in topology, dispersal strongly impacts the tumor’s growth rate and doubling 
time. Although the size N of the tumor inereases with time T from initiation as ~ without dispersal 
(Fig. 7), it grows exponentially for M>0. This also remains true for long-range dispersal where M 
plays the role of the reseeding probability of new lesions through eireulation from the primary tu¬ 
mor. Using plausible estimates for the rates of eell birth, death, and dispersal probability, we ealeu- 
late that it takes 8 years for a lesion to grow from one eell to a billion eells in the absenee of disper¬ 
sal (M=0), but less than 2 years with dispersal (Fig. 2e). The latter estimate is eonsistent with exper¬ 
imentally determined rates of metastasis growth as well as elinieal experienee, while the eonvention- 
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al model (without dispersal) is not. In sum, even when individual eaneer eells divide and die at iden- 
tieal rates, short-range eell dispersal has a pronouneed effeet on the size and shape of the resulting 
tumor. 


3. Treatment and the evolution of resistance 

Non-spatial models point to the size of a tumor as a oritieal determinant of chemotherapeutie drug 
resistanee [26,27,28,29,30]. To determine whether a spatial model would similarly prediet this de- 
pendeney in a clinieally relevant time frame, we caleulated tumor regrowth probabilities following 
targeted therapies. We assume that the cell that initiates the lesion is susceptible to treatment, other¬ 
wise the treatment would have no effect on the mass, and that the probability of a resistant mutation 
is 10'^ (Sec. 6.1.1); only one such mutation, among all the cells of the metastatic tumor, is needed for 
a regrowth. 

Figure 3 a shows snapshots from a simulation (Supp. Video 1) performed before and after the admin¬ 
istration of a typical targeted therapy. In this simulation, the metastatic lesion grows from its initiat¬ 
ing cell to form a small lesion (~3 mm) when therapy is administered at time J'=0. The size of the 
lesion then rapidly decreases, but one month later resistant clones begin to proliferate and form tu¬ 
mors of microscopic size. Such resistant subclones are predicted to be nearly always present in le¬ 
sions of sizes that can be visualized by clinical imaging techniques [29,30,31]. By six months after 
treatment, the lesions have regrown to their original size. The evolution of resistance is a stochastic 
process - some lesions shrink to zero and some regrow (Fig. 8). Figure 3b,c shows the probability of 
regrowth versus the time from the lesion’s initiation to the onset of treatment upon varying net 
growth rates b-d and dispersal probabilities. Regardless of growth rate, the capacity to migrate 
makes it somewhat more likely that regrowth will occur at shorter periods (compare green and blue 
curves with red curves in Fig. 3b), particularly for more aggressive cancers, i.e., those which have 
higher net growth rates (compare Figs. 3b and 3c). This is in line with recent theoretical work on 
evolving populations of migrating cells [32,33]. If resistant mutations additionally increase the dis¬ 
persal probability before or during treatment, regrowth is faster (Fig. 8b,c). Note that our model as¬ 
sumes the drug is uniformly distributed in the tumor [34]; it is known that drug gradients can speed 
up the onset of resistance [35]. 


4. Primary tumors 

Having shown that the spatial model's predictions are consistent with metastatic lesion growth and 
regrowth times, we turn to primary tumors. Here the situation is considerably more complex because 
the tumor cells are continually acquiring new driver gene mutations that can endow them with fit¬ 
ness advantages over adjacent cells within the same tumor. In contrast, in metastatic lesions we as¬ 
sumed all cells had the same fitness in the absence of targeted therapy. Our model of a primary tu¬ 
mor assumes that it is initiated via a single driver gene mutation that provides a selective growth ad¬ 
vantage over normal neighboring cells. Each subsequent driver gene mutation reduces the death rate 
as d=b{\-sf, where k is the number of driver mutations in the cell (A:>1), and s is the average fitness 
advantage per driver. Identical results are obtained if driver gene mutations increase cell birth rather 
than decrease cell death, or affect both cell birth and cell death (Figs. 9, 11); the only important pa¬ 
rameter is the fitness gain, s, conferred by each driver mutation. 

Figure 4a shows that, in the absence of any new driver mutations (as for a perfectly normal cell 
growing in utero), clonal subpopulations would be distributed in a conical fashion. Each of these 


Page 3 



cones has at least one new genetie alteration, but none of them eonfers a fitness advantage (they are 
"passengers"). In an early tumor, in whieh the eenter eell harbors the initiating driver gene mutation, 
the same eonieal strueture would be observed - as long as no new driver gene mutations have yet ap¬ 
peared. The oeeurrenee of a new driver gene mutation, however, dramatieally alters the spatial dis¬ 
tribution of eells. In partieular, the heterogeneity observed in normal eells (Fig. 4a) is substantially 
redueed (Fig. 4b and Supp. Video 5). The degree of heterogeneity ean be quantified by ealculating 
the fraetion of genetie alterations (GAs, passengers plus drivers) shared between two eells separated 
by various distanees (Fig. 4d-f). The homogeneity is markedly inereased (eompare blue vs. red 
curves in Fig. 4e), even with relatively small fitness advantages (s=l%). This also has implieations 
for the number of GAs that will be present in a macroseopic fraetion (e.g., >50%) of all eells. Figure 
4f shows that this number is many times larger for s=l% than s=0%. Furthermore, our model pre¬ 
diets that virtually all eells within a large tumor will share at least one new driver gene mutation after 
5 years of growth (Fig. 11). The faster the clonal expansion oecurs (the larger s is), the smaller the 
number of passenger GAs (Fig. lid). Our results are also robust to ehanges to the model (See. 6.4 
and Figs. 11, 12). We stress that an important prerequisite for homogeneity is eell turnover in the 
tumor, beeause in the spatial setting eells with driver mutations ean “pereolate” through the tumor 
only if they replaee other eells. In the absenee of eell turnover, tumors are mueh more heterogeneous 
(Fig. I2d). 


5, Conclusion 

Our model for the spatial evolution of eaneers aceounts for many faets observed elinieally and ex¬ 
perimentally. Our results are robust and many assumptions (replieation rate proportional to the num¬ 
ber of surrounding normal cells, drivers lowering the death rate, eonstant dispersal rate) ean be re¬ 
laxed without qualitatively affeeting the outeome (See. 6.4 and SI). Though tumor eell migration has 
historieally been viewed as a feature of eaneer assoeiated with late events in tumorigenesis, sueh as 
invasion through basement membranes or vaseular walls, this elassieal view of migration pertains to 
the ability of eaneer eells to migrate over large distanees [36]. Instead, we foeus on small amounts of 
eellular movement and show that they are able to dramatieally reshape a tumor. Moreover, we pre¬ 
diet that the rate of tumor growth ean be substantially altered by a ehange in dispersal rate of the 
eaneer eells, even in the absenee of any changes in doubling times or net growth rates of the eells 
within the tumor. Some of our predietions eould be experimentally tested using new cell labelling 
techniques [37,38]. Virtually all previous explanations for the effeets of driver gene mutations have 
assumed that the differenee between eell birth and eell death are the major, if not sole, determinants, 
of eaneer growth. Our results suggest that small differenees in eell migration ean have equally pow¬ 
erful effects on cancer growth - not on individual eells, but on masses of eells in the three- 
dimensional patterns that aetually oeeur in vivo. This idea eould greatly inform the interpretation of 
mutations in genes whose main functions seem to be related to the eytoskeleton or to cell adhesion 
rather than to eell birth, death, or differentiation [39,40,41]. For example, eells that have lost the ex¬ 
pression of e-eadherin (a eell adhesion protein) are more migratory than normal eells with intaet e- 
eadherin expression [42], and loss of e-eadherin in panereatic eaneer has been assoeiated with poorer 
prognosis [43], in line with our predietions. 
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6, Methods 

6,1. Spatial model for tumor evolution. 

Tumor modelling has a long tradition [44], Many models of spatially expanding tumours were 
proposed in the past [13,14,15,16,17,18,21,45,46,47,48], but they either assume very few 
[13,15,45,47,49,50,51] or no new mutations at all [14,19,20,46,52,53], or one- or two-dimensional 
growth [14,15,16,17,54,55]. On the other had, well-mixed models with multiple mutations 
[6,8,56,57] do not often include space, and more biologically realistic computational models 
[19,24,58] require too much computing resources (time and memory) to simulate tumors of 
clinically relevant size (N-IO^ cells). Our model builds upon the Eden lattice model [59] and 
combines spatial growth and accumulation of multiple mutations. Since we focus on the interplay of 
genetics, spatial expansion, and short-range dispersal of cells, for simplicity we do not explicitly 
model metabolism [18], tissue mechanics, spatial heterogeneity of tissues, different types of cells 
present, or angiogenesis [21]. 

A tumor is made of non-overlapping balls (microlesions) of cells. Tumor cells occupy sites of a 
regular 3d square lattice (Moore neighborhood, 26 neighbors). Empty lattice sites are assumed to be 
either normal cells or filled with extracellular matrix and are not modeled explicitly. Each cell in the 
model is described by its position and a list of genetic alterations (GAs) that have occurred since the 
initial neoplastic cell, and the information whether a given mutation is a passenger, driver, or 
resistance-carrying mutation. A passenger mutation does not affect the net growth rate whereas a 
driver mutation increases it by disrupting tight regulation of cellular divisions and shifts the balance 
towards increased proliferation or decreased apoptosis. The changes can also be epigenetic and we 
do not distinguish between different types of alterations. We assume that each GA occurs only once 
(“infinite allele model” [60]). The average numbers of all GAs, driver, and resistant GAs produced 
in a single replication event are denoted respetively by y, ya, and yr. When a cell replicates, each of 
the daughter cells receives n new GAs of each type {n being generally different in both cells) drawn 
at random from the Poisson probability distribution 

-y 72 / 

e X (y 12) 

Pin)= -( 1 ) 


where “x” denotes the type of GA. 

In Model A shown in Eigs. 2-4, replication occurs stochastically with rate proportional to the number 
of empty sites surrounding the replicating cell, and death occurs with constant rate depending only 
on the number of drivers. We also simulated other scenarios (Models B, C, D, see below). Driver 
mutations increase the net growth rate (the difference between proliferation and death) either by in¬ 
creasing the birth rate or decreasing the death rate by a constant factor 1+5 where 5>0. 

Dispersal is modelled by moving an offspring cell to a nearby position where it starts a new micro¬ 
lesion (Eig. 5a). Microlesions repel each other; a “shoving” algorithm [61,62] (Eig. 5b) ensures they 
do not merge. 

The computer code can handle up to 10^ cells, which corresponds to tumors that are clinically mean¬ 
ingful and can be observed by conventional medical imaging (diameter >lcm). The algorithm is dis¬ 
cussed in details in the Supplementary Information. It is not an exact Kinetic Monte Carlo (KMC) 
algorithm because such an algorithm would be too slow to simulate large tumors. A comparison with 
KMC for smaller tumors (Supplementary Information and Eig. 14) shows that both algorithms pro¬ 
duce consistent results. 
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6.1.1. Model parameters 

The initial birth rate h=ln 2=0.69 days'^ which corresponds to a 24h minimum doubling time. The 
initial death rate J=0...0.9956 depends on the aggressiveness of the tumor (larger values = less ag¬ 
gressive lesion). In simulations of targeted therapy, we assume that, prior to treatment, 6=0.69 days"' 
and J=0.56=0.35 days'\ whereas during treatment 6=0.35 days’' and J=0.69 days’', i.e. birth and 
death rates swap places. This rather arbitrary choice leads to the regrowth time of about 6 months 
which agrees well with clinical evidence. Mutation probabilities are y= 0.02, Yd=4-10’^, Yr'^lO in line 
with experimental evidence and theoretical work [8,63,64,65]. Since there are no reliable data on the 
dispersal probability M, we have explored a range of values between M=10’ and 10’ . All parame¬ 
ters are summarized in Table 2, see also further discussion in Supplementary Information. 

6.1.2. Validity of the assumptions of the model 

Our model is deliberately oversimplified. However, many of the assumptions we make can be exper¬ 
imentally justified or shown not to qualitatively affect the model. 

Three-dimensional regular lattice of cells. The 3d Moore neighborhood was chosen because it is 
computationally fast and introduces relatively fewer artifacts related to lattice symmetries. Real tis¬ 
sues are much less regular and the number of nearest neighbors is different [66]. However, recent 
simulations of similar models of bacterial colonies [67,68] show that the structure of the lattice (or 
the lack thereof in off-lattice models) has a marginal effect on genetic heterogeneity. 

Asynchronous cell division. Division times of related cells remain correlated for a few generations. 
However, stochastic cell division implemented in our model is a good approximation for a large 
mass of cells and is much less computationally expensive than modelling a full cell cycle. 

Replication faster at the boundary than in the interior. Several studies have described a higher 
proliferation rate at the leading edge of tumors, and this has been associated with a more aggressive 
clinical course [69]. To estimate the range of values of death rate d for our model, we used the pro¬ 
liferation marker Ki-67. Representative formalin-fixed, paraffin-embedded tissue blocks were se¬ 
lected from 4 small chromophobe renal cell carcinomas and 6 small hepatocellular carcinomas by 
the pathologist (M.E.P). A section of each block was immunolabeled for Ki67 using the Ventana 
Benchmark XT system. Eight to 12 images, depending on the size of the lesion, were acquired from 
each tumor. Eields were chosen at random from the leading edge and the middle of the tumor and 
were not necessarily “hot spots” of proliferative activity. Using an Image! macro, each Ki-67 posi¬ 
tive tumor nucleus was labeled red by the pathologist, and each Ki67 negative tumor nucleus was 
labeled green. Other cell types (endothelium, fibroblasts, inflammatory cells) were not labeled. The 
proliferation rate was then calculated using previously described methods [70]. The study was ap¬ 
proved by the Institutional Review Board of the Johns Hopkins University School of Medicine. In all 
ten tumors the proliferation rate at the leading edge of the tumor was greater than that at the center 
by a factor of 1.25 to 6 (Table 1). Comparing the density of proliferating cells to our model gives 
d~Q.5b (range: <7=0.176... 0.86), which is what we assume in the simulations of aggressive lesions. 

Equal fitness of all cells in metastatic lesions. We assume that cells in a metastatic lesion are al¬ 
ready very fit since they contain multiple drivers. Experimental evidence in microbes [71] and (to a 
lesser extent) in eukaryotes [72] suggests that fitness gains due to individual mutations are largest at 
the beginning of an evolutionary process and that the effects of later mutations are much smaller. 
Consequently, new drivers occurring in the lesion will not be able to spread through the population 
before the lesion reaches a clinically relevant size. Indeed, studies of primary tumors and their 
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matched metastases usually fail to find driver mutations present in the metastases that were not pre¬ 
sent in the primary lesions [2,73]. 

Dispersal. In our model, eells detaeh from the lesion and attaeh again at a different loeation in the 
tissue. This ean be viewed either as eells migrating from one plaee to another one, or as a more ge- 
nerie mechanism that allows tumor cells to get better aeeess to nutrients by dispersing within the tis¬ 
sue, henee providing a growth advantage over eells that did not disperse. Some meehanisms that do 
not involve active motion (i.e. eells beeoming motile) are discussed below. 

Migration. Cancer eells are known to undergo epithelial-to-mesenchymal transition (EMT), whose 
origin is thought to be epigenetic [25]. This involves a eell beeoming motile and moving some dis- 
tanee. If the cell finds the right environment, it ean switeh baek to the non-motile phenotype and 
start a new lesion. Motility ean be enhaneed by tissue fluidization due to replieation and death [74]. 
Instead of modelling the entire eyele (epithelial-mesenchymal-epithelial), we only model the final 
outeome (a eell has moved some distanee). 

Tumor buds. Many tumors exhibit foeally invasive eell elusters, also known as tumor buds. Their 
proliferation rate is less than that of eells in the main tumor [75]. We hypothesize that tumor buds 
eontain cells that have not yet eompleted EMT and therefore they proliferate slower. 

Single versus cluster migration. Ref [76] found that eirculating eaneer eells (CRCs) ean travel in 
elusters of 2-50 eells and that such CRC clusters ean initiate metastatic foci. They report that approx¬ 
imately half of the metastatie foei they examined were initiated by single CRCs, and that CRC elus¬ 
ters initiated the other half. The authors also note that the cells forming a eluster are most likely 
neighboring eaneer eells from the primary tumor. This means that the genetie makeup of cells within 
a newly established lesion will be very similar, regardless of its origin (single eell versus a small 
eluster of eells). Therefore the ability to travel in elusters should not affect the genetie heterogeneity 
or regrowth probability as eompared to single-oell dispersal from our model. 

Angiogenesis. We do not explieitly model angiogenesis for two reasons. Eirst, most GAs that ean 
either ehange the growth rate or be detected experimentally must oecur at early stages of tumor 
growth as explained before. Henee, the genetie make-up of the tumor is determined primarily by 
what happens before angiogenesis. Seeond, loeal dispersal from the model mimies tumor eells inter¬ 
spersing with the vascularized tissue and getting better aeeess to nutrients, which is one of the out- 
eomes of angiogenesis. 

Biomechanics of tumors. Growth is affected by the mechanical properties of cells and the extracel¬ 
lular matrix. We do not explicitly include biomechanics, in contrast to more realistic models [77,78], 
as this would not allow us to simulate lesions larger than about 10^ cells. Instead, we take experi¬ 
mentally determined values for birth and death rates, values that are affected by biomechanics, as the 
parameters of our model. 

Isolated balls of cells. In our simulations, balls of cells are thought to be separated by normal, vas¬ 
cularized tissue which delivers nutrients to the tumor. Each ball’s environment is the same and there 
are no interactions between the balls other than mechanical repulsion. This represents a convenient 
mathematical contrivance and qualitatively recapitulates what is observed in stained sections of ac¬ 
tual tumors (Eig. la). However, in reality microlesions within the primary tumor are neither spheri¬ 
cal nor completely separated, and some of them are better described as “protrusions” bulging out 
from the main tumor tissue due to biomechanical instabilities [79]. Nevertheless, we believe that 
modelling the tumor as a collection of non- or weakly-interacting microlesions is essentially correct. 
Many tumor cells are less adhesive and more elastic than normal cells [80], and it is known that dif- 
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ferences in cellular adhesion and stiffness promote segregation of different types of eells [81,82], 
Thus, if a cancer cell migrates far enough from the surface of the lesion, its progeny will form a clus¬ 
ter of cells rather than mix with the normal tissue. For the same reason, and unlike in the model, 
nearby microlesions may partially merge. However, the capillary network of blood vessels - either 
due to tumor angiogenesis or preexisting in the invaded tissue - may still provide enough nutrients to 
the lesions so that our assumption of independently growing balls of eells remains valid. 


6.2, Tumor geometry and heterogeneity in the absence of driver mutations 

Supplementary videos 2 and 3 illustrate the process of growth of a tumor with maximally A^=10^ 
eells, forM=0 andM=10'^, respectively, and for J=0.5. Figure 6 shows snapshots from a single sim- 
ulation for M=0, A^~10 , and J=0 (no death, panel a) and <i=0.9 (panel b). In the latter case, cells are 
separated by empty sites (normal cells/extracellular matrix). Panel (c) shows that the tumor is almost 
spherically symmetric for M=0. The symmetry is lost for small but non-zero M, and restored for 
larger M when the balls become smaller and their number increases. Panel (c) also shows that meta¬ 
static tumors contain many clonal sectors with passenger mutations. Figure 10a shows that the frac¬ 
tion G{r) of GAs that are the same in two randomly sampled cells (cf Fig. 4) separated by distance r 
quiekly deereases with r, indicating increased genetic heterogeneity due to passenger mutations. 

6.3, Targeted therapy of metastatic lesions 

Models of cancer treatment [29,30,34,83,84,85] often assume either no spatial structure or do not 
model the emergenee of resistanee. We assume that the eell that initiated the lesion was sensitive to 
treatment but its progeny may become resistant. Before the therapy commences, all cells have the 
same birth and death rates, but after the treatment resistant cells continue to proliferate with the same 
rate whereas suseeptible cells are assigned different rates as deseribed above. Resistant cells can 
emerge prior to and during the therapy. 

Supplementary Video 1 and Figure 8a show that, since the proeess of resistanee aequisition is sto- 
chastie, some tumors regrow after an initial regression, and some do not. If only resistant cells can 
migrate, regrowth is faster (Fig. 8b,c). Figure 8d-g shows regrowth probabilities Rregrowth for different 
treatment scenarios not mentioned in the main text, depending on whether the drug is cytostatic 
(^treatment =0) or cytocidal (^/treatment = b), and whether d=0 or d>0 before treatment. In Panel (d), cells 
replicate and die only on the surface, and the core is “quiescent” - cells are still alive there but cannot 
replicate unless outer layers are removed by treatment (Supp. Video 6 and 7). Rregrowth does not de- 

O 

pend on the migration rate M at all, and is close to 100% for V>10 eells, a size that is larger than for 
d>0 (Panel (f)). It can be shown that Rregrowth = I-exp(-YrA0- Panel (e) is for the cytostatic drug (htreat- 
ment=<^treatment=0); this is also equivalent to the cytoeidal drug if the tumor has a neerotic eore (eells 
are dead but still occupy physical volume). In this case, Rregrowth increases with M because more re¬ 
sistant cells are on the surface for larger M (cells can replicate only on the surface in this scenario). 
Figure 8f,g shows models with eell death present even in the absence of treatment {d=Q.9b) but oc¬ 
curring only at the surface, unlike in Fig. 3 where cells also die inside the tumor. Death increases 
F’regrowth due to a larger number of cellular division neeessary to obtain the same size, and hence more 
opportunities to mutate. 


6,4, Relaxing the assumptions of the model 
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Figure 4 shows that even a small fitness advantage substantially reduees genetic diversity through 
the process of clonal expansion, see also Suppl. Videos 4 and 5. We now demonstrate that this also 
applies to modified versions of the model, proving its robustness. 

Only the net growth rate matters. Figure 9 shows that an advantageous mutation with 5=1% 
spreads through the tumor regardless of whether drivers affect death (upper panels) or growth (lower 
panels), also in the presence of non-zero migration. This is further confirmed in Fig. 10b,e, which 
shows that the average number of shared GAs is larger in the presence of drivers. Figure 10c,f shows 
that as long as 5>0 and regardless of its exact value, driver mutations homogenize the tumor com¬ 
pared to the case 5=0. Figure 1 la-c shows how many driver mutations are expected to be present in a 
randomly chosen cell from a tumor that is T years old. Neither migration nor the way drivers affect 
growth (via birth or death rate) has significant effect on the number of drivers per cell (Panels b,c). A 
small discrepancy visible in Panel (b) is caused by a slightly asymmetric way death and birth is 
treated in our model, see the Supplementary Information. 

Model B. In this modified model, cells replicate with constant rate if there is at least one empty 
neighbor. In the absence of drivers, GAs are distributed evenly throughout the lesion (Fig. 12b) but 
they often occur independently and the number of frequent GAs is low (Fig. 12e). Drivers cause 
clonal expansion as in Model A. 

Model C. Cells replicate regardless of whether there are empty sites surrounding them or not. When 
a cell replicates, it pushes away other cells towards the surface (Supplementary Information). Figure 
I2c,e shows that this again leads to clonal expansion which decreases diversity. 

Model D. Replication/death occurs only on the surface and the core of the tumor is static. Figure 12d 
shows that driver mutations cannot spread to the other side of the lesion and conical clonal sectors 
can be seen even for 5>0. The number of frequent GAs is the same for 5=0 and 5=1% indicating that 
genetic heterogeneity is not lowered by clonal expansion. This demonstrates that cell turnover inside 
the tumor is very important for reducing heterogeneity. To obtain the same (low) heterogeneity as 
for Models A-C, the probability of driver mutations must be much larger in Model D (Fig. 12f). 

Drivers affecting M, We investigated three scenarios in which drivers affect (1) only the migration 
rate M-^(1+^)M where q>0 is the “migration fitness advantage” (no change in b,d), (2) both Mand 
d, i.e. {d,M)^{d{\-s),{^+q)M) with 5,^>0, (3) either M or d, with probability 1/2. Figure 13 shows 
that growth is unaffected in cases (1,3) compared to the neutral case. For (2) the tumor growth rate 
increases significantly when the tumor is larger than A=10^ cells. This shows that migration increas¬ 
es the overall fitness advantage, in line with Ref. [86] which shows that fixation probability is de¬ 
termined by the product of the exponential growth rate and diffusion constant (motility) of organ¬ 
isms. 
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Fig. 1. Structure of solid tumors, (a) Hematoxylin and eosin stained section of hepatocellular carcinoma. 
The neoplasm is composed of two distinct balls of cells (*) separated by non-tumor tissue (^). Note the small¬ 
er nests (balls) of neoplastic cells within the lower lesion, (b-c) Hepatocellular carcinoma immunolabeled 
with the proliferation marker Ki67. The middle of the tumor (b) shows decreased proliferation as compared to 
the leading edge of the tumor (top of c), see Table 1. The red circle in (b) demonstrates an example of cells 
(inflammatory cells) that would not have been included in the total cell count. The areas below the red line in 
(c) are non-tumor stroma and non-tumor liver tissue. 
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Fig. 2. Short-range migration affects size, shape, and growth rate of tumors, (a) A spherical lesion in the 
absence of migration (M=0) and (b) a conglomerate of lesions, each initiated by a cell that has migrated from 
a prior lesion, for low but non-zero migration (M=10'^). Colors reflect the degree of genetic similarity; cells 
with similar colors have similar genetic alterations (GAs). The death rate is (i=0.8Z> which corresponds to a net 
growth rate of 0.2Z?=0.14days'’, and A=10’ cells, (c) The number of cancer cells N increases in time much 
faster for M>0 than in the absence of migration. 
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b net growth rate 0.5b = 0.34days‘i 



net growth rate O.lh = 0.07days‘^ 



Fig. 3. Treatment success rates depend on the net growth rate of tumors, (a) Time snapshots prior to and 
during therapy (“m” = months). Treatment increases the death rate and decreases the growth rate of suscepti¬ 
ble cells throughout the tumor. Resistant subpopulations that cause the tumor to regrow after treatment can be 
seen at r=lm. Parameters: d=0.5b in the absence of treatment, M=10'®; treatment begins when the tumor has 
A^=10^ cells. (b-c) Probability of tumor regrowth (Pregrowth) as a function of time after treatment initiation, for 
different migration probabilities (M) and net growth rates of the resistant cells. A higher growth rate (b) (ei¬ 
ther because of a lower death rate d or equivalently a higher birth rate b) leads to a high regrowth probability, 
so that 50% of tumors regrow six months after treatment is initiated when M=10'^. (c) Tumors with lower net 
growth rates (i.e., higher death rates) require >20 months to achieve the same probability of regrowth. In both 
(b) and (c) we assume that the growth rate of resistant cells after therapy is identical to that of the sensitive 
cells prior to treatment. The death rate of sensitive cells during treatment is greater than the birth rate, or the 
tumors would not be sensitive to the drug. 
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Fig. 4. Genetic diversity is strongly reduced by the emergence of driver mutations. Genetic diversity 
within a single lesion (M=0) with initial death rate d=0.99b (net growth rate = O.OOVdays’'). The three most 
abundant genetic alterations (GAs) have been color-coded using red, green, and blue (see panel c). Combina¬ 
tions of the three basic colors correspond to cells having 2 or 3 of these GAs. (a) No drivers - separated, coni¬ 
cal sectors emerge in different part of the lesion, each corresponding to a different clone, (b) Drivers with fit¬ 
ness advantage s=l% lead to clonal expansions and many cells have all 3 GAs (white area), (d) Drawing 
shows how genetic diversity can be determined quantitatively: genotypes of two randomly chosen cells, sepa¬ 
rated by distance r, are compared and the number of shared GAs is determined, (e) The number of shared 
GAs versus the normalized distance r/<r> decreases much slower for the case with (red) than without driver 
mutations (blue), (f) The total number of GAs present in at least 50% of all cells is much larger for s=l% than 
for s=0%. 
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Fig. 5. Details of the model, (a) A sketch showing how dispersal is implemented: (1) A ball of cells of radius 
Ri which center is at Xi is composed of tumor cells and normal cells (blue and empty squares in the zoomed-in 
rectangle (2)). A cell at position x, w.r.to the center of the ball attempts to replicate (3). If replication is suc¬ 
cessful, the cell migrates with probability M and creates a new microlesion (4). The position Xj of this new 
ball of cells is determined as the endpoint of the vector which starts at Xi and has direction x, and length Ri. (b) 
Overlap reduction between the balls of cells. When a growing ball begins to overlap with another ball (red), 
they are both moved apart along the line connecting their centers of mass (green line) by as much as necessary 
to reduce the overlap to zero. The process is repeated for all overlapping balls as many times as needed until 
there is no overlap. 
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Fig. 6. Simulation snapshots, (a-b) A few snapshots of tumor growth for no dispersal and (a) d=0 (b) 
d=0.9b. To visualize clonal sectors, cells have been color-coded by making the color a heritable trait and 
changing each of its RGB components by a small random fraction whenever a cell mutates. The initial cell is 
gray. Empty space (white) are stromal cells mixed with extracellular matrix. Note that images are not to scale, 
(c) Tumor shapes for A=10^, d=0.9b, and different dispersal probability M. Images not to scale - the tumor for 
M=10'^ is larger than the one for M=0. 
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Fig. 7. Tumor size as a function of time, (a) Growth of a tumor without dispersal (M=0), for d=0.Sb. For 
large times (7), the number of cells grows approximately as const^r^. The tumor reaches size A^=10^ cells 
(horizontal line) after about 100 months (8 years) of growth, (b) The same data is plotted in the linear scale, 
with replaced by “linear extension” 
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Fig. 8. Simulation of targeted therapy, (a-c) The total number of cells in the tumor (black) and the number 
or resistant cells (red) versus time, during growth (r<0) and treatment (T>0), for ~100 independent simula¬ 
tions, for d=0.5b for r<0. Therapy begins when A^=10^ cells. Upon treatment, many tumors die out (A^ de¬ 
creases to zero) but those with resistant cells will regrow sooner or later, (a) M=0 for all cells at all times, (b) 
M=0 for all cells for T<0 and M=10'"^ for resistant cells for T>0. (c) M=0 for non-resistant and M=10'^ for 
resistant cells at all times. In all three cases. Pregrowth is very similar: 36+/-5% for (a), 25+/-4% for (b), and 
21+1-6% for (c). (d-g) Regrowth probability for four treatment scenarios not discussed in the main text. Data 
points correspond to three dispersal probabilities: M=0 (red), M=10'^ (green), and M=\6'^ (blue). The proba¬ 
bility is plotted as a function of tumor size Adjust before the therapy commences, (d) Before treatment, cells 
replicate only on the surface. Cells in the core are quiescent and do not replicate. Therapy kills cells on the 
surface and cells in the core resume proliferation when liberated by treatment, (e) As in (d) but drug is cyto¬ 
static and does not kill cells but inhibits their growth. The results for Pregrowth are identical if the drug is cyto¬ 
toxic and the tumor has a necrotic core (cells die inside the tumor and cannot replicate even if the surface is 
removed), (f) Before treatment, cells replicate and die on the surface. The core is quiescent. Therapy kills 
cells on the surface (cytotoxic drug), (g) As in (f) but therapy only inhibits growth (cytostatic drug). 
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Fig. 9. Most abundant GAs in the presence of drivers and dispersal. Sections through the center of the 
tumor (layer of thickness of 80 cells and symmetric about the center) for A=10^, dispersal probability M=10'^, 
with three most abundant mutations color-coded as red, green, and blue (as in Fig. 4). Three columns corre¬ 
spond to three representative independent simulation runs. A small color-mixing palette in the top-left comer 
explains how these three mutations combine to produce other colors. Upper panel: drivers decrease the rate of 
death (as in the main text). Lower panel: drivers increase the rate of growth. Middle panel: no driver muta¬ 
tions (neutral case with 5=0% and d=Q.99b). 
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Fig. 10. Genetic diversity quantified, (a) Tumors are much more genetically heterogeneous in the absence of 
driver mutations (s=0). The plot shows the fraction G(r) of genetic alterations (GAs) shared between the cells 
as function of their separation (distance r) in the tumor. The fraction quickly decreases with increasing r. The 
distance in the figure is normalized by the average distance <r> between any two cells in the tumor. For a 
spherical tumor, <r> is approximately equal to half of the tumor diameter, (b) Fraction of shared GAs for 
5=1% and 5=0%, A=10^, and M =10’’. In the presence of drivers. Gif) decays slower, indicating more homo¬ 
geneous tumors, (c) The exact value of the selective advantage of driver mutations is not important (all curves 
GAs(r) look the same, except for 5=0) as long as 5>0. (d,e,f): Number of GAs present in at least 50% of cells 
for identical parameters as in panels (a,b,c), correspondingly. Drivers substantially increase the level of genet¬ 
ic homogeneity. 
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Fig. 11. Accumulation of driver and passenger GAs. (a-c) The number of drivers per cell in the primary 
tumor plotted as a function of time, (a) M=0 and three different driver selective advantages. For 5=1%, cells 
accumulate on average one driver mutation within 5 years. The time can be significantly lower for very strong 
drivers (5>1%). (b) The rate at which drivers accumulate depends mainly on their selective advantage and not 
on whether they affect death or birth rate, (c) Dispersal does not affect the rate of driver accumulation, (d-e) 
The number of passenger mutations (PMs) per cell versus the number of driver mutations per cell. More PMs 
are present for smaller driver selective advantage (panel d) and this is independent of the dispersal probability 
M (panel e) in the regime of small M. Data points correspond to independent simulations. 
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Fig. 12. Genetic diversity in a single lesion for different models, (a-d) Representative simulation snapshots, 
with GAs color-coded as in Fig. 4. Upper panels: 5=0, lower panels: 5=1%. Panel (a): Model A from the main 
text in which cells replicate with rates proportional to the number of empty nearby sites. Panel (b): Model B - 
the replication rate is constant and non-zero if there is at least one empty site nearby, and zero otherwise. Pan¬ 
el (c): Model C - cells replicate at a constant rate and push away other cells to make space for their progeny. 
Panel (d): Model D - cells replicate/die only on the surface, the interior of the tumor (“necrotic core”) is static. 
In all cases A=10^ d=0.99b. (e) Number of GAs present in at least 50% of cells for identical parameters as in 
panels a-d. In all cases except surface growth (d), drivers increase genetic homogeneity, measured by the 
number of frequent GAs. (f) Model D with yd =2x10“'^ instead 4x10'^, i.e., drivers occur 5x more often. In this 
case, driver mutations arise earlier than in (d) and the tumor becomes more homogeneous. 
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T [months] 

Fig. 13. Tumor size versus time when drivers affect the dispersal probability. In all cases, d=0.9b, and (1, 
black) drivers increase the dispersal rate 10-fold (q=9) but have no effect on the net growth rate, (2, red) driv¬ 
ers increase both the net growth rate (5=10%) and M, (3, green) drivers either (with probability 1/2) increase 
M 10-fold {q=9) or increase the net growth rate by 5=10%, (4, blue) drivers increase only the net growth rate 
by 5=10%, (5, black dashed line) neutral case with M=10'^, indistinguishable from (1). In all cases (1-3) the 
initial dispersal probability M=10'^. 
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Fig. 14. Comparison between the algorithm from the main text and the kinetic-Monte Carlo (KMC) 
algorithm (Supplemental Information), (a) In the absence of death, both models predict identical growth, 
(b) Death makes the model differ for large times, (c) The difference almost disappears if the death rate is re¬ 
scaled as c/—>0.897c/ in the KMC algorithm, also in the presence of drivers, (d) The number of drivers as a 
function of time differs slightly between the two models. 
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CASE 

TUMOR TYPE 

EDGE 

KI % 

EDGE 

Total no. of cells 

CENTER 
KI % 

CENTER 

Total no. of cells 

RATIO 

center:edge 

1 

CHROMOPHOBE 

RCC 

3.46 

1013 

1.11 

2561 

0.32 

2 

CHROMOPHOBE 

RCC 

3.07 

508 

1.05 

938 

0.34 

3 

CHROMOPHOBE 

RCC 

2.63 

524 

0.44 

697 

0.17 

4 

CHROMOPHOBE 

RCC 

1.58 

581 

0.53 

958 

0.34 

5 

HCC 

17.14 

892 

9.74 

1637 

0.57 

6 

HCC 

51.71 

1079 

32.84 

2562 

0.64 

7 

HCC 

47.37 

435 

19.97 

1397 

0.42 

8 

HCC 

19.02 

895 

13.78 

1191 

0.72 

9 

HCC 

15.09 

1074 

11.98 

1094 

0.79 

10 

HCC 

29.84 

1305 

20.87 

2457 

0.70 


Table 1. Experimental results for the percentage of proliferating cells in the center versus at the edge of 
solid tumors. A representative section of each tumor was labeled for the proliferation marker Ki67 (KI), and 
images of the tumor at the leading edge and the center were acquired as described (Sec. 6.1.2). Proliferation is 
markedly increased at the leading edge. The average ratio of the number of proliferating cells in the center/at 
the edge is 0.50 (range 0.17-0.79). 
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Parameter 

Meaning 

Value 

b 

Birth rate 

In 2 = 0.69 days'* 

d 

Death rate 

0 ... 0.995-6 

s 

Selective advantage 

0 ... 0.1 (=0 ... 10%) 

Y 

Mutation probability (all GAs) 

0.02 

Yd 

Mutation probability (drivers only) 

4-10'^ 

Yr 

Mutation probability (resistance-carrying mutations) 

lO'* 

M 

Dispersal probability 

0 ... 10'" 


Table 2. Summary of all parameters used in the model. If, for a given parameter, many different values 
have been used in different plots, a range of values used is shown. Birth and death rates can also depend on 
the number of driver mutations, see Sec. 6.1. 
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SUPPLEMENTAL INFORMATION 


Bartlomiej Waclaw, Ivana Bozic, Meredith E. Pittman, Ralph H. Hruban, Bert 

Vogelstein, and Martin A. Nowak 

1 Computer algorithm 

In each time step, a random cell is chosen, its fate (replication or death) decided, and the time 
variable updated to account for the elapsed time since the last event. More specifically, for Model 

A, 

1. A random tumor cell i is chosen. Let its genotype be g. 

2. A nearest neighbor of cell i is picked at random and if it is empty, cell i replicates there 
and creates a new cell j with probability proportional to the birth rate bg of cell i divided 
by the maximal birth rate 6 max of all cells present in the tumor. This ensures that the 
replication probability is always less or equal to one, and that it is proportional to the 
number of empty neighbors of L In the case of no replication (site not empty or rejected), 
proceed to step 5. 

3. Each of the two cells i,j receive n* and rij new mutations, with n*, nj drawn independently 
from the Poisson probability distribution (Eq. (1) in Sec. 6.1). If either ni,nj is zero, the 
genotype of the corresponding cell does not change. 

4. With probability M, the new cell j migrates and forms a new ball (microlesion) in the 
proximity of the surface of the ball from which it originates. The 3d position Xj of the 
new ball is determined as Xj = Aj + RiXi/\xi\, where Xi denotes the position of the ball 
of cell i, Ri is the radius of that ball, and Xi is the position of cell i relative to the center 
of the ball, see Fig. 5a. 

5. Cell i dies with probability equal to its death rate dg divided by the maximal death rate 
dmax of all existing tumor cells. If all cells within a ball die, the ball is removed. 

6 . Time is updated by a small increment dt = l/(6max-N), where N is the total number of 
cells in the tumor. This ensures that b has the correct interpretation of probability per 
unit time, and the number of cells in an exponentially growing population (for example, 
for large M) increases as N{t) = exp ((6 — d)t). 

Therefore, cells replicate only if there is at least one empty site in the neighborhood, but die 
with rates independent of the number of empty neighbors. If a cell is completely surrounded by 
other cells, it cannot replicate. However, in simulations presented in the main text we always 
assume d > 0 so that even such “quiescent” cells can eventually resume replication if they survive 
long enough so that one of their neighbors die first and make space for their progeny. We leave 
the interesting case of d = 0 (essentially, the Eden model [60] with mutations) to be discussed 
in future work. 

Note that (i) each daughter cell receives different GAs, and that the average number of new 
GAs per daughter cell per replication is 7/2 in our notation, (ii) for small 7 (realistic situation, 
see Sec. 6.1.1), the probability of a single, new GA in any of the daughter cells is approximately 
equal to 7 . 

The simulation always begins with a single cell (and thus also a single ball of cells) at position 
X = (0,0,0). We assume that this cell already has a selective advantage over normal cells. We 
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therefore do not model the process of tumor initiation e.g. by inactivation of tumor suppressor 
genes. The initial cell begins to replicate as described above, and once two or more balls of cells 
are formed as the result of growth and short-range dispersal, the balls start to repel each other 
mechanically. This reduces the otherwise imminent and physically impossible overlap between 
the balls of cells. Because calculating forces and solving equations of motion for hundreds of 
balls every time a new cell is added to the tumor would be computationally too expensive and 
required a detailed biophysics model of the tissue, the algorithm we use to reduce the overlap is 
a less-realistic but more efficient “shoving” algorithm similar to the method used in the biofilm 
simulator iDynomics [62,63]. The algorithm works by moving apart each pair of overlapping 
balls along the line joining their centers of mass, see Fig. 5b, and is executed only if the radius 
R of any ball (measured as the distance from the center of the ball to the most distant cell) has 
increased by 5% or more compared to the last time the algorithm was used. A few iterations are 
usually necessary to relocate all overlapping balls, but because this is done rarely as compared to 
replication/death of individual cells, it does not have a huge impact on performance for typical 
scenarios we simulated (a few hundreds of balls). 

For Model B, step (2) of the algorithm is modified as follows: If cell i has no empty neighbors, 
proceed to step 5. Otherwise, pick up one of them at random and, with probability bg/hmax, create 
a new cell j. If no replication, proceed to step 5. Model B differs from A in that the replication 
probability does not depend on the number of empty (normal) cells in the neighborhood. 

For Model C, in step (2) cell i always replicates, regardless of whether there are empty sites 
nearby or not. The site to which the cell replicates is determined as follows. Ten randomly 
selected directions out of 26 possible directions are scanned until an empty site is reached, 
and the direction along which the number of occupied sites is the smallest is chosen. The 
algorithm proceeds to the first site in this direction, and the scanning is repeated. The whole 
cycle (scanning/moving by one site) is continued until an empty site is reached. Then each 
cell encountered during the procedure is shifted to the position of the next site in the chain. 
This eventually creates an empty site near cell i to which it replicates. This algorithm mimic 
mechanical displacement of cells towards the surface caused by replication inside the tumor. 
The path of “minimal drag” chosen by the algorithm is self-avoiding, i.e., when choosing the 
direction for the next step the algorithm checks if a site has been already visited and, if it has, 
this direction is discarded and another (random) one is chosen. The procedure introduces small 
artifacts related to lattice symmetries, but deviations from the expected spherical shape of the 
lesion are negligible, c.f. Fig. 12c. 

For Model D, step (5) is modified as follows: cell i dies with probability equal to its death 
rate dg multiplied by the fraction of empty nearby sites and divided by dmax- Thus, both the 
replication and death rates are proportional to the number of empty neighbors. This causes that 
cells completely surrounded by other cancer cells not only do not replicate, but also do not die, 
and form a quiescent core. 

2 Model parameters 

Growth and death rates prior to treatment. The initial neoplastic/cancer cell has the 
birth rate 6 = ln2 ~ 0.69 days“^. This corresponds to a 24h minimum doubling time, though in 
Model A this is achieved only for cells surrounded by empty space (normal cells), and most cells 
(surrounded by other tumor cells) will replicate slower on average. 

In the case of metastatic (secondary) lesions, the birth and death rates bg = b,dg = d are 
constant and do not depend on the genotype g. The death rate d assumes values between zero 
and b, depending on how aggressive we want the tumor to be. As for primary lesions, in the 
simulations presented in the main text bg = b remains the same for all cells prior to treatment 
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and is not affected by mntations. Driver mntations change only the death rate dg as follows: 
dg = b{l — , where kg is the nnmber of driver mntations in genotype g inclnding the initial 

driver mntation (hence kg > 1), and s is the selective advantage of each new driver, similarly to 
Ref. [ 8 ]. In the case of primary lesions, the rate b sets the time scale of the model. There is no 
independent parameter d for the death rate which is expressed as a fraction of b. 

We also consider the case in which dg = 6(1 — s) and bg = 6(1 + i.e. the initial cell 

replicates and dies respectively with rates 6 and 6(1 — s), and driver mntations increase the birth 
rate. As before, the only parameter that sets the time scale here is 6 , the birth rate of the initial 
cell, and the death rate d is a fnnction of 6 and s. We show below that the resnlts for this case 
are almost identical to the case from the previons paragraph where drivers rednce the rate of 
death. Conseqnently, the model behavior is determined by the difference between the birth and 
death rates, and it does not matter which of the two rates is affected by driver mntations. 

For primary tnmors we have explored a range of selective advantages s = 0.5% ... 5%. Most 
of the resnlts presented in the main text are for s = 1%. For metastatic lesions, where we do 
not model driver mntations, we take d /6 = 0.5.. .0.9, with lower death rates corresponding to 
more agressive and faster growing tnmors. This corresponds to the total selective advantage 
0.5 ... 0.1 = 50% ... 10% over normal cells. 

Mutation probabilities 7 , 7 d, 7 r' These are strictly speaking the average nnmbers of GAs 
per parental cell per replication, bnt since these nnmbers are small we can think abont them as 
the probabilities of all mntations ( 7 ), driver mutations ( 7 d), and resistant mutations ( 7 ^). The 
mutation probabilities are not known precisely and also vary among different cancers. Here we 
have used numbers that are representative for many cancers. For the total mutation probability 
we take 7 = 0.02. This value corresponds to each of ~ 6 • 10^ bp in the diploid exome mutating 
with probability ~ 3.3 • 10“^^ per bp per generation. For driver mutations, we take 7 d = 4 • 10“^ 
which is about the same as in Ref. [ 8 ]. For resistant mutations we assume that 7 r = 10“^. This 
is equivalent, for example, to about 100 potential mutations that confer resistance to any drug 
targeting a specific protein in cancer cells, each of them occurring with probability 10 which 
is an upper estimate on the probability of point mutations in somatic cells [64, 65]. The value 
of 7 r is of the same order of magnitude as estimated in Ref. [ 66 ] for targeted EGFR blockade. 

Dispersal probability M. Since no reliable in vivo data exists for this parameter, in our 
model we explore a range of values between M = and M = 10“^. Our results for the 

growth time of a metastatic lesion indicate that M = 10“^ ... 10“® gives a reasonable agreement 
with clinically observed tumor growth times, and hence we typically use values from this range. 
Note that M as defined in our model is the probability that a cell not only detaches from the 
lesion and moves to a nearby place, but that it also anchors there and starts proliferating. 

3 Tumor growth rate for the neutral model 

Figures 2, 6 show that in the absence of driver mutations (s = 0) and for M = 0 (no dispersal), 
tumors grow to a roughly spherical shape. The number of cells (and hence also the volume) 
of the tumor grows approximately as N r-^ r3, see Fig. 7. This is easy to understand - the 
inside of the tumor remains in the state of equilibrium between replication and death, and only 
the surface is able to expand and invade the surrounding tissue as the result of cancer cells’ 
net growth advantage. Because the number of cells in the surface layer of a spherical tumor is 
proportional to the rate of change of the total number of cells is ~ which 

gives N{t) = At^. The proportionality coefficient A depends on the thickness of the outermost 
layer of cells that is not at equilibrium and on the net growth rate of cells in this layer, and in 
general is a complicated function of 6 and d which we could not determine analytically. 

For M > 0, however, the model exhibits a transition to exponential growth when the size N 
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becomes much larger than 1/M, see Fig. 2. This can be explained as follows. Let N{n,t) be the 
number of balls of cells having n cells at time t. Since in our model the balls do not interact 
with one another (besides mechanical repulsion) and their growth is unaffected by the presence 
of other balls, we can write the following Master equation for the rate of change of N(n,t) in 
time: 

= 6n,iM ^ B{k)N{k, t) + (1 - M)B{n - l)N{n - 1, t) - (1 - M)B{n)N{n, t), (1) 

k=l 

where B{n) is the rate at which a ball of size n grows to size n + 1. The first term in Eq. Q 
proportional to M accounts for the production of new balls of cells due to dispersal. This 
happens with the rate equal to the total rate of replication, hence the sum over balls of all sizes 
k = 1,..., oo weighted by N(k, t), the number of balls of size k. The second term is the gain 
term due to replication in a ball of size n — 1 which creates a ball of size n, and the last term is 
the loss term due to a ball of size n increasing its size by one cell. 

If we knew B{n) exactly, we could in principle solve Eq. Q and find the total size N{t) = 
as a function of time. Unfortunately, it is not easy to determine B{n) analytically 
for arbitrary n because of the vast number (and the lack of regularity of) configurations that 
cells can take in a single ball of size n ^ 1. This problem was identified many years ago in the 
Eden model [60], and it has no known exact solution for large n. However, here we are only 
interested in the asymptotic behavior of B(n) for large sizes n because we anticipate that only 
this large-n behavior will affect the tumor growth rate at large times. From our earlier results 
for M = 0 we conclude that B(n) ~ for n —)• oo and although finite-size corrections to this 
formula can be quite strong, we shall neglect them here for the sake of simplicity. 

Let us now discuss the large-t solution of Eq. 0 . We shall keep a general form of B{n) 
and replace it by B(n) ~ in the final step. Equation 0 is a linear differential equation in 
N(n, t) and therefore its long-time behavior must be dominated by 

N{n,t) = (2) 


where Hexp plays the role of the largest eigenvalue of the linear operator acting on N (n, t) on 
the r.h.s. of Eq. Q. Its biological interpretation is that of the exponential growth rate of the 
whole tumor, 

Af(t) = ^nA^(n, t) = ^np(n), (3) 

n n 

and p{n) is now a time-independent, stationary distribution of ball sizes. One could think that 
we have just shown that N{t) grows exponentially, but this is not true; it remains to be seen 
that Hexp = const > 0 for M > 0, otherwise sub-leading terms could lead to a sub-exponential 
(i.e. power-law as we have seen above) growth. 

To proceed, we insert Eq. @ into Eq. Q and obtain the following recursion relation for p{n): 

Be^pP{n) = Sn,iM '^^p{k)B{k) -|- (1 — M)B{n — l)p{n — 1) — (1 — M)B{n)p{n). (4) 

k 


For n = 1 this reduces to 


whereas for n > 1 we can 


p{l) = M 


J2kPik)B{k) 

Hexp + (1 - M)H(l) ’ 


rewrite Eq. Q and express p{n) as a function of p{n 
B{n-l){l- M) 


p{n) = p{n — 1 ) 


Hexp + B{n){l - M)' 


(5) 

1 ) as follows 

( 6 ) 
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Iterating the above equation we obtain p{n) as a product of n fractions as in the above equation, 
times p(l). If we now use this to expand have that 


Y.p{k)B[k) 

k 


oo 

l + Y,B{k) 

k=2 


B{n-l){l-M) \ 


EkPik)B{k) 

B,^p + {l-M)B{iy 


( 7 ) 


and after dividing both sides by the common factor 'Yl,kP{k)B{k) we obtain that M obeys the 
following equation: 


1 

M 


1 

Bexp + 5(1)(1-M) 


i + Y,m 

k=2 


(A B{n-l){l-M) \ 
[i}^Be.p + B{n){l-M)J 


( 8 ) 


The above equation allows us to calculate i?exp as a function of the dispersal probability M, 
albeit in general it must be solved numerically. However, if we now use the asymptotic form 
B{n) ~ upon inserting it into Eq. ([^ and canceling lower-order terms in M we have 

~ Ml/3 (9) 


for M <C 1. This concludes our proof that the rate of exponential growth is non-zero for M > 0. 
Equation ([^ also predicts that Bexp is proportional to the cubic root of the dispersal probability 
M. Numerical simulations (results not shown) indicate that, for M > 10“®, the actual Hgxp 
increase slower with M than ~ approximately as JV^0.25...0.3 jg caused by deviations 

from the law B{n) ~ n^/3 for balls smaller than ~ cells, for which B{n) is better approximated 
by ~ nO-7-0-75. 


4 Reseeding and long-range migration 

In our model each cell that migrates from its original site starts a new ball of cells and there 
are no interactions (besides mechanical repulsion) between balls of cells. Thus, if we neglect 
spatial distribution of genetic alterations (GAs) and focus only on the total number of GAs and 
whole-tumor growth, our algorithm can also model long-range metastasis and reseeding of new 
lesions. All our predictions except those related to the spatial distribution of GAs remain valid 
for the reseeding model. For example, we have shown that, with short-range dispersal, tumor 
growth becomes exponential with rate ~ M^/3, where M is the dispersal probability. If we also 
include the possibility of reseeding (long-range migration) with probability R, the total mass will 
grow at an increased rate (M -F i?)i/3. 

5 Accumulation of driver mutations 

Figure lla-c suggests that the number of drivers per cell increases linearly in time. We have 
made simulations of larger tumors (up to 10® cells), and different s, and although the growth 
rate increases in time, the drivers accumulate approximately at a constant rate (data not shown). 
This is in contrast to what has been shown in. Ref. [8] for exponentially growing tumors without 
any spatial structure, where the number of drivers increases exponentially in time. 

This counter-intuitive outcome is the result of the spatial structure of the tumor and all 
driver mutations having the same selective advantage s in our model. If s is small, a new driver 
arising in the background of the previous driver mutations spreads within the population with 
the same speed as the previous drivers, equal to the speed with which the tumor expands into 
the normal tissue. This expansion, which is similar to travelling fronts in other spatial models. 
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can be demonstrated mathematically. Onr statement remains trne for snfficiently large tnmors 
in which most cells are inside the ball and not on the snrface. Therefore, apart from first few 
drivers, the i-th clonal snbpopnlation follows the same growth law as the (i — l)-th clone bnt it 
starts at a later time ti, i.e. its linear extension (“diameter”) li{t) = l{t — U) where the fnnction 
l{t) is approximately the same for all clones. Moreover, the times ti at which consecntive driver 
mntations arise are approximately eqnally spaced in time. This is becanse the rate at which 
new drivers are prodnced increases very slowly with i. This symmetry npon time translation 
canses linear accumnlation of driver mutations over time. We plan to investigate this effect in a 
subsequent publication. 

6 Alternative simulation method: kinetic Monte Carlo (KMC) 
algorithm 

Although we talk about rates of different processes, our simulation algorithm is not an exact 
KMC such as e.g. the Gillespie algorithm. The reason we did not use the standard algorithm 
was twofold. First, real cells do not reproduce fully stochastically. Regardless of what algorithm 
is used, a stochastic model will never be fully realistic. Second, a fully stochastic simulation 
would be much slower than our algorithm, even when more advanced methods than the Gillespie 
algorithm were used. 

To check how sensitive are the results of our modelling to the algorithm used, we simulated 
the model using a KMC, with cells undergoing independent stochastic birth/death events with 
rates b and d. Figure 14 shows that there is only a small difference between the two algorithms. 
One important drawback (except low speed) of the KMC algorithm is that zero net growth rate 
does no longer correspond to d = b, but to d ~ 0.8976 because of a non-trivial dependence of 
the replication rate on the local density of cells. Consecutively, if s is to be interpreted as the 
selective advantage, the initial (6 — d)/d (the selective advantage of the first driver) used in the 
KMC algorithm must be multiplied by 0.897 to make the results comparable to our algorithm. 
This, and a substantial reduction in the simulation speed, caused that we used the non-KMC 
algorithm. 
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